ABSTRACT



Context. We address the issue of controlling the systematic errors in shape measurements for weak gravitational lensing.
Aims. We take a step towards quantifying the impact of systematic errors in modeling the point spread function (PSF) of
observations on the determination of cosmological parameters from cosmic shear.

Methods. We explore the impact of PSF fitting errors on cosmic shear measurements using the concepts of complex- 
ity and sparsity. Complexity, introduced in a previous paper, characterizes the number of degrees of freedom of the 
PSF. For instance, fitting an underlying PSF with a model of low complexity produces small statistical errors on the 
model parameters, although these parameters could be affected by large biases. Alternatively, fitting a large number 
of parameters (i.e. a high complexity) tends to reduce biases at the expense of increasing the statistical errors. We 
attempt to find a trade-off between scatters and biases by studying the mean squared error of a PSF model. We also 
characterize the model sparsity, which describes how efficiently the model is able to represent the underlying PSF using 
a limited number of free parameters. We present the general case and give an illustration for a realistic example of a 
PSF modeled by a shapelet basis set. 

Results. We derive a relation between the complexity and the sparsity of the PSF model, the signal-to-noise ratio of 
stars and the systematic errors in the cosmological parameters. By insisting that the systematic errors are below the 
statistical uncertainties, we derive a relation between the number of stars required to calibrate the PSF and the sparsity 
of the PSF model. We discuss the impact of our results for current and future cosmic shear surveys. In the typical 
case where the sparsity can be represented by a power-law function of the complexity, we demonstrate that current 
ground-based surveys can calibrate the PSF with few stars, while future surveys will require hard constraints on the 
sparsity in order to calibrate the PSF with 50 stars. 
1. Introduction

Studying spatial correlations between galaxy shapes induced
by gravitational lensing of the large scale structure
('Cosmic Shear') is a powerful probe of dark energy and
dark matter. A number of current and planned surveys are
dedicated to cosmic shear, such as: the Canada-France-Hawaii-Telescope
Legacy Survey (CFHTLS), the Kilo
Degree Survey and the VISTA Kilo-Degree Infrared Galaxy
Survey (KIDS/VIKING), the Dark Energy Survey
(DES), the Panoramic Survey Telescope & Rapid Response
System (Pan-STARRS), the SuperNovae Acceleration
Probe (SNAP), the Large Synoptic Survey Telescope
(LSST) and the Dark UNiverse Explorer (DUNE/Euclid).



http://www.cfht.hawaii.edu/Science/CFHLS/
http://www.eso.org/sci/observing/policies/
http://www.esa.int/esaCP/index.html



The most efficient way of improving the statistical precision
of cosmic shear analyses is by enlarging the surveys. As
long as the median redshift is sufficiently high (z > 0.7),
Amara & Refregier (2007a) demonstrate that it is more
advantageous to perform cosmic shear surveys in fields
as wide as possible, rather than deep, in order to minimize
the error bars on cosmological parameters. To date, the
largest data set optimized for cosmic shear is the Wide field of
the CFHTLS, which covers 50 deg$^2$ (Fu et al. 2008) and will
eventually reach 170 deg$^2$. Another analysis has also been
published that combines 4 surveys that, together, cover an
area of 100 deg$^2$ (Benjamin et al. 2007). In a few years,
the KIDS/VIKING survey will cover 1500 deg$^2$. Eventually,
projects currently being planned, such as DUNE/Euclid,
planned for 2017, will be able to perform cosmic shear
measurements over the entire observable extragalactic sky
(~ 20,000 deg$^2$).

For cosmic shear surveys to reach their full potential,
it is necessary to ensure that systematic errors are
sub-dominant relative to statistical uncertainties. In particular,
a tight control on all the effects associated with
shape measurements is required. To illustrate the difficulty
in compiling accurate shear measurements, we begin with
an overview of the 'forward process' that illustrates how the
original image of a galaxy is distorted in forming the final
image that we measure. In the forward process, a galaxy
image is: (i) sheared by gravitational lensing; (ii) convolved
with a point spread function (PSF) originating in a number
of sources (e.g. instruments and atmosphere); (iii) pixelated
at the detector; and finally (iv) affected by noise.

Cosmic shear analyses involve the reverse process: we
begin with the final image and move backwards from step
(iv) to (i) in recovering the original lensing effect. A detailed
and illustrated description of the forward and inverse
processes is given in the GREAT08 Challenge Handbook
(Bridle et al. 2008). The GREAT08 Challenge aims to bring
a wide range of expertise to gravitational lensing by
presenting the relevant issues in a clear way so as to make
it accessible to a broad range of disciplines, including the
machine-learning and signal-processing communities. Other
similar challenges have also been performed within the
weak lensing community, as part of the STEP collaboration
(Heymans et al. 2006; Massey et al. 2007; Rhodes et al.
2008), which focused mainly on understanding the systematic
errors at play in current shear measurement methods.
These challenges focus mainly on reducing the errors
originating in the shape measurement method. However, even
with a perfect method there are fundamental limits due to the
statistical potential of a data set.

In Paulin-Henriksson et al. (2008) (PI hereafter), we investigated
the link between systematic errors in the power
spectrum and uncertainties in the PSF correction phase.
The framework is the following. Since the PSF of an instrument
varies on all scales, the PSF needs to be measured
using the stars that surround the lensed galaxy. Each star
provides an image of the PSF that is pixelated and noisy,
which means that to reach a given accuracy in the knowledge
of the PSF, a number of stars is required. We estimate
the number of stars required to calibrate the PSF to a
given accuracy, according to the stellar signal-to-noise ratio
(SNR), the minimum galaxy size, the complexity of the
PSF and the tolerated variance in the systematics $\sigma^2_{\rm sys}$. On
the other hand, Amara & Refregier (2007b) estimated the
upper limit to $\sigma^2_{\rm sys}$ when estimating cosmological parameters.
By combining both papers, we derive the
minimum number of stars required to reach a given accuracy.
For instance, analyses completed to date, which allow
us to constrain $\sigma_8$ with an accuracy of 0.05, require $\sigma^2_{\rm sys}$
lower than a few $10^{-6}$ and the PSF to be calibrated by using
5 stars; while for future ambitious surveys that will allow
us to constrain $w_0$ and $w_a$ to an accuracy of 0.02 and 0.1,
respectively, $\sigma^2_{\rm sys}$ must be lower than $10^{-7}$, which requires
at least 50 stars (for stars with signal-to-noise ratio of 500
and a PSF described by a few degrees of freedom, as can
be typically achieved in space).

In PI we use the same functional form for both the underlying
PSF and the model used to fit it. This means
that the PSF model is able to describe the underlying PSF
perfectly. The errors in the fit due to noise cause a scatter
of the fitted parameters around the truth. For instance, if
the model is an orthogonal basis set, then the fitted parameters
follow a Gaussian distribution around the truth.
In this paper, we extend this investigation by studying the
impact of fitting a PSF with a model that has a different
form. This addresses the case in which the underlying PSF
(unknown in practice) is estimated by fitting the parameters
of an arbitrary model. This can lead to both a scatter
in the fitting parameters and an offset in the average value
relative to the true value, i.e. a bias in the fitting parameters.
We can therefore model a given PSF using either a
complex model with small biases but large scatters, or a simpler
model that would lead to smaller scatters but larger
biases. To quantify these effects, we revisit the concept of
complexity proposed in PI and introduce the concept of
sparsity.

This paper is organised as follows: first, in Sect. 2 we
discuss the concepts of complexity and sparsity, which are
the key concepts of this paper; Sect. 3 presents our notation;
Sect. 4 presents the definition of optimal complexity,
illustrates our formalism with a PSF example and uses the
sparsity as a tool for optimizing the complexity; Sect. 5 derives
the minimum number of stars required to calibrate
the PSF, extending results of PI; and finally, Sect. 6 summarizes
our conclusions.

2. Complexity and sparsity 

In PI, we introduced the concept of complexity; we demonstrated
that a few complexity factors characterize the
amount of information that needs to be collected about
the PSF. This is summarized and revisited in Sect. 2.1. In
Sect. 2.2, we introduce the concept of sparsity, which measures
the ability of a PSF model to represent the underlying
PSF with a small number of free parameters. This allows us
to explore how an optimal PSF model can be constructed
to minimize $\sigma^2_{\rm sys}$.

2.1. Complexity 

In PI, we define the complexity factors of the PSF, which
represent the number of degrees of freedom (DoF) that are
estimated from stars (in the limit of infinite resolution, i.e.
infinitely small pixels): the higher the number of DoF, the
larger the complexity factors. Each PSF shape parameter
is associated with a complexity factor that is related to
the rms of its estimator. In the simple formalism where
we consider unweighted quadrupole moments, the PSF is
characterized by only two complexity factors $\psi_\epsilon$ and $\psi_{R^2}$
associated with the 2 component PSF ellipticity $\epsilon_{\rm PSF}$ and
the square PSF rms radius $R^2_{\rm PSF}$ respectively (as defined in
PI). For a given star, one has:

$$ \begin{pmatrix} \psi_{R^2} \\ \psi_\epsilon \end{pmatrix} \equiv \mathcal{S}_* \begin{pmatrix} \sigma[R^2_{\rm PSF}]/R^2_{\rm PSF} \\ \sigma[\epsilon_{\rm PSF}] \end{pmatrix} \tag{1} $$

where $\mathcal{S}_*$ is the photometric SNR of the star, $\sigma[R^2_{\rm PSF}]$ is the
rms error of the PSF size estimator and $\sigma[\epsilon_{\rm PSF}]$ is the rms
error of the PSF ellipticity estimator. As in PI, we assume
that the small ellipticity regime holds (i.e. $|\epsilon_{\rm PSF}| \lesssim 0.1$),
implying that the measure of $\epsilon_{\rm PSF}$ is isotropic and the 2 components
of the ellipticity have the same rms uncertainty,
i.e. $\sigma[\epsilon_{{\rm PSF},i}] = \sigma[\epsilon_{\rm PSF}]$.

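If the PSF can be considered constant over several stars,
or for particular representations of the PSF (for example
with shapelet basis sets in the small ellipticity regime, see
PI), $\psi_{R^2}$ and $\psi_\epsilon$ are spatially constant and Eq. 1 can be
extended to a set of several stars. For a combination of
several stars, $\mathcal{S}_*$ becomes $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$:

$$ \begin{pmatrix} \psi_{R^2} \\ \psi_\epsilon \end{pmatrix} = \sqrt{n_*}\,\mathcal{S}_{\rm eff} \begin{pmatrix} \sigma[R^2_{\rm PSF}]/R^2_{\rm PSF} \\ \sigma[\epsilon_{\rm PSF}] \end{pmatrix} \tag{2} $$

where $\mathcal{S}_{\rm eff}$ is the effective stellar SNR and $n_*$ is the effective
number of stars $k$ used in the PSF calibration, as defined
by (see PI):

$$ n_*\,\mathcal{S}^2_{\rm eff} \equiv \sum_k \mathcal{S}^2_k \tag{3} $$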
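
As a rough numerical illustration of Eqs. 2 and 3, the sketch below combines per-star SNRs into $n_*$ and $\mathcal{S}_{\rm eff}$ and converts a complexity factor into the corresponding rms error. It assumes that $n_*$ is simply the number of stars in the list; the function names and the SNR values are hypothetical and are not taken from PI.

```python
import math


def effective_snr(star_snrs):
    """Return (n_star, s_eff) such that n_star * s_eff**2 equals the sum of S_k**2 (Eq. 3).

    Assumption: n_star is taken to be the number of stars in the list.
    """
    n_star = len(star_snrs)
    if n_star == 0:
        raise ValueError("at least one star is required")
    s_eff = math.sqrt(sum(s * s for s in star_snrs) / n_star)
    return n_star, s_eff


def rms_from_complexity(psi, n_star, s_eff):
    """Invert Eq. 2: rms error = psi / (sqrt(n_star) * s_eff)."""
    return psi / (math.sqrt(n_star) * s_eff)


# Example with made-up SNR values for three stars.
n_star, s_eff = effective_snr([300.0, 400.0, 500.0])
sigma_e = rms_from_complexity(psi=2.0, n_star=n_star, s_eff=s_eff)
```

With this definition, a few bright stars and many faint stars can carry the same total information as long as the sum of squared SNRs is the same.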

We also show in PI that the polar shapelet basis set,
proposed by Massey & Refregier (2005), tested on simulated
data in Massey et al. (2007) and used on real data by
Berge et al. (2008), is particularly convenient for modeling
the PSF. For example, in the small ellipticity regime, $\psi_\epsilon$
and $\psi_{R^2}$ depend only on the polar shapelet basis set over
which the PSF is decomposed, not on the PSF itself. For
this reason we use shapelets in this paper when illustrating
our discussions by an example. Note that our results and
conclusions are not restricted to shapelets but remain valid
whatever the PSF model. For convenience, we choose the
shapelet 'diamond' option (described in detail in PI) that
imposes a lower limit to the scales described, implying a
link between $\psi_\epsilon$ and $\psi_{R^2}$:



(4) 



with M the highest even integer lower than or equal to the
order $n_{\rm max}$ of the basis set. We then consider the overall
complexity $\Psi$, defined in the following according to $\psi_\epsilon$, $\psi_{R^2}$
and the variance in the galaxy ellipticity distribution.

2.2. Sparsity 

In this paper, we introduce the concept of sparsity of the
PSF model, which describes how efficiently a model can
represent the underlying PSF with a limited number of
DoF (i.e. with a limited complexity). Specifically, the sparsity
quantifies how the residuals between the estimated and
the underlying PSF decrease as the complexity of the PSF
model increases. With a high number of DoF, i.e. a high
complexity, one might expect small residuals but large scatters
in the fitted parameters. On the other hand, with a
small number of DoF, i.e. a low complexity, one might expect
large residuals but small scatters in the fitted parameters.
The sparsity characterizes the slope in this relation
and thus is an estimate of the amount of information that
can be contained in a given number of DoF. We show how
to use sparsity in optimizing the complexity and minimizing
$\sigma^2_{\rm sys}$.

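Consider the shape parameters $R^2_{\rm PSF}$ and $\epsilon_{\rm PSF}$ of the
underlying PSF, as defined previously. The differences
$\delta(R^2_{\rm PSF})$ and $\delta\epsilon_{\rm PSF}$ between the underlying PSF ('true' index)
and its estimation ('est' index) can be written:

$$ \delta(R^2_{\rm PSF}) \equiv R^2_{\rm PSF,est} - R^2_{\rm PSF,true}, \qquad \delta\epsilon_{\rm PSF} \equiv \epsilon_{\rm PSF,est} - \epsilon_{\rm PSF,true} \tag{5} $$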



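These differences are of two types: the statistical scatter
$\sigma$ relative to the average and the bias-offset $b$ of the average
relative to the true value. The mean square errors (MSE)
of $R^2_{\rm PSF}$ and $\epsilon_{\rm PSF}$ are:

$$ {\rm MSE}[R^2_{\rm PSF}] = \sigma^2[R^2_{\rm PSF}] + b^2[R^2_{\rm PSF}] \tag{6} $$
$$ {\rm MSE}[\epsilon_{{\rm PSF},i}] = \sigma^2[\epsilon_{{\rm PSF},i}] + b^2[\epsilon_{{\rm PSF},i}] \quad {\rm with}\ i = 1, 2 \tag{7} $$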
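
The split of the MSE into a scatter term and a bias term (Eqs. 6 and 7) can be checked numerically. The following is a minimal sketch on a list of estimates of a single scalar parameter; the function name and the input values are hypothetical, and the population (not sample) variance is used so that the identity holds exactly.

```python
def mse_decomposition(estimates, truth):
    """Return (bias, variance, mse) for a list of estimates of one parameter.

    bias     : mean(estimates) - truth
    variance : population variance of the estimates around their mean
    mse      : mean squared distance to the truth, equal to variance + bias**2
    """
    n = len(estimates)
    if n == 0:
        raise ValueError("at least one estimate is required")
    mean = sum(estimates) / n
    bias = mean - truth
    variance = sum((x - mean) ** 2 for x in estimates) / n
    mse = sum((x - truth) ** 2 for x in estimates) / n
    return bias, variance, mse


# Made-up estimates of a parameter whose true value is 1.0.
bias, variance, mse = mse_decomposition([1.0, 2.0, 3.0], truth=1.0)
```

A low-complexity model corresponds to a small `variance` and a possibly large `bias`; a high-complexity model corresponds to the opposite.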



In PI, we address the zero bias case $b[\epsilon_i] = b[R^2] = 0$, which
is equivalent to assuming that the PSF model is able to describe
the underlying PSF perfectly. However, this nulling of the
biases is not necessarily the optimal PSF modeling. It can
be advantageous to work with a simplistic PSF model that
is unable to describe all the PSF features and has some
biases but low statistical scatter (see our PSF example in
Sect. 4.1 and Fig. 1). This paper proposes another approach
that consists in optimizing the PSF model in order to minimize
$\sigma^2_{\rm sys}$. We do this by searching for the optimal trade-off
between the systematic errors ($b[\epsilon_1]$, $b[\epsilon_2]$, and $b[R^2]$) and
the statistical errors ($\sigma[\epsilon_i]$ and $\sigma[R^2]$), which is equivalent
to searching for the optimal complexity $\Psi$ of the PSF model.
The sparsity allows us to perform this optimization because
it characterizes the decrease in the biases $b$ as $\Psi$ increases.

In the following, we define a 'sparsity parameter' $\alpha$ in
the particular case where the biases are modeled as a power-law
function of the PSF model complexity ($B \propto 1/\Psi^\alpha$), and
we study the impact of $\alpha$ on the number of stars required
to calibrate the PSF. We thus revisit the main result of PI
by deriving $N_*$ (the number of stars required to calibrate
the PSF) according to $\alpha$ instead of $\Psi$ (the complexity of the
underlying PSF). Moreover, this new relation is optimized
to minimize $\sigma^2_{\rm sys}$.
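
In practice, a sparsity parameter of this kind could be estimated from a handful of (complexity, bias) pairs by a straight-line fit in log-log space. The sketch below is one simple way to do this, not the procedure used by the authors; the function name is hypothetical and the input values are synthetic.

```python
import math


def fit_sparsity(psis, biases):
    """Least-squares fit of log(B) against log(Psi).

    Returns (alpha, b_ref) such that B is approximated by
    b_ref * (psis[0] / Psi) ** alpha, i.e. b_ref is the fitted bias at psis[0].
    """
    if len(psis) != len(biases) or len(psis) < 2:
        raise ValueError("need at least two (Psi, B) pairs")
    xs = [math.log(p) for p in psis]
    ys = [math.log(b) for b in biases]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    alpha = -slope  # B falls as Psi rises, so the sparsity is minus the slope
    b_ref = math.exp(intercept + slope * xs[0])
    return alpha, b_ref


# Synthetic biases following an exact power law with alpha = 4 (illustrative values).
psis = [2.6, 4.3, 7.8, 16.4]
biases = [2e-5 * (2.6 / p) ** 4 for p in psis]
alpha, b_ref = fit_sparsity(psis, biases)
```

A larger fitted `alpha` means that each added degree of freedom removes more of the residual bias.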

We emphasize that, in this paper, we propose to optimize
the complexity of the PSF model within a given
basis set. We do not address the issue of choosing the basis
set itself. There is no doubt that, to optimize the PSF modeling,
it is necessary to select this basis set carefully. For
instance, generic basis sets such as shapelets, wavelets, or
Fourier modes, although they have enormous advantages,
are not optimal. This issue will be addressed in forthcoming
works.



3. PSF calibration for shear measurement 

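When deconvolving the observed galaxy with the estimated
PSF, $\delta(R^2_{\rm PSF})$ and $\delta\epsilon_{\rm PSF}$ propagate into an error $\delta\epsilon^{\rm sys}_{\rm gal}$ in
the estimation of the galaxy ellipticity. We denote $R_{\rm gal}$ and
$\epsilon_{\rm gal}$ the rms radius and the two-component ellipticity of
the galaxy. When $R_{\rm PSF}$, $\epsilon_{\rm PSF}$, $R_{\rm gal}$, and $\epsilon_{\rm gal}$ are defined
using the unweighted moments of the flux, this propagates
to (see PI):

$$ \delta\epsilon^{\rm sys}_{\rm gal} \simeq (\epsilon_{\rm gal} - \epsilon_{\rm PSF})\, \frac{\delta(R^2_{\rm PSF})}{R^2_{\rm gal}} - \left(\frac{R_{\rm PSF}}{R_{\rm gal}}\right)^2 \delta\epsilon_{\rm PSF} . \tag{8} $$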
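
To make the propagation concrete, the sketch below evaluates Eq. 8, read here as $\delta\epsilon^{\rm sys}_{\rm gal} \simeq (\epsilon_{\rm gal} - \epsilon_{\rm PSF})\,\delta(R^2_{\rm PSF})/R^2_{\rm gal} - (R_{\rm PSF}/R_{\rm gal})^2\,\delta\epsilon_{\rm PSF}$. Ellipticities are represented as Python complex numbers ($\epsilon_1 + i\,\epsilon_2$); the function name and the numerical values are hypothetical.

```python
def galaxy_ellipticity_error(e_gal, e_psf, d_r2_psf, r2_gal, r2_psf, d_e_psf):
    """Systematic error on the galaxy ellipticity from PSF errors (Eq. 8).

    e_gal, e_psf : complex ellipticities (e1 + 1j * e2) of galaxy and PSF
    d_r2_psf     : error on the squared PSF rms radius
    r2_gal       : squared rms radius of the galaxy
    r2_psf       : squared rms radius of the PSF
    d_e_psf      : complex error on the PSF ellipticity
    """
    size_term = (e_gal - e_psf) * d_r2_psf / r2_gal
    ellipticity_term = (r2_psf / r2_gal) * d_e_psf
    return size_term - ellipticity_term


# Made-up example: a galaxy 1.5 times larger than the PSF (R_gal^2 / R_PSF^2 = 2.25).
d_e_gal = galaxy_ellipticity_error(
    e_gal=0.3 + 0.1j, e_psf=0.05 + 0.0j,
    d_r2_psf=0.0225, r2_gal=2.25, r2_psf=1.0,
    d_e_psf=0.001 + 0.0j,
)
```

The second term shows why small galaxies are the most demanding: the PSF ellipticity error is weighted by $(R_{\rm PSF}/R_{\rm gal})^2$.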



The spatial average of $|\delta\epsilon^{\rm sys}_{\rm gal}|^2$ is related to the variance
in the systematic errors in the shear measurements $\sigma^2_{\rm sys}$
(Amara & Refregier 2007b), defined by the integral of the
systematics $C^{\rm sys}_\ell$ in the power spectrum:

$$ \sigma^2_{\rm sys} \equiv \frac{1}{2\pi} \int d(\ln \ell)\, \ell(\ell+1)\, |C^{\rm sys}_\ell| \tag{9} $$
$$ \sigma^2_{\rm sys} = (P^\gamma)^{-2} \left\langle |\delta\epsilon^{\rm sys}_{\rm gal}|^2 \right\rangle \tag{10} $$

with $P^\gamma$ the calibration factor between the gravitational
shear and the ellipticity. Its value depends on the distribution
of galaxy ellipticities and is typically about 1.84
(Rhodes et al. 2000). The brackets $\langle\ \rangle$ denote a spatial
average over the entire field. As in PI, we substitute Eq. 8
into Eq. 10 with the following simplifying assumptions:

1. The galaxy is not correlated with the PSF. 



2. The error on the PSF ellipticity ($\delta\epsilon_{\rm PSF}$) and the PSF
ellipticity itself ($\epsilon_{\rm PSF}$) are not correlated. This is warranted
by the fact that, in the assumed small ellipticity
regime, $\delta\epsilon_{\rm PSF}$ does not have any preferred direction,
implying that $\langle \epsilon_{\rm PSF} \cdot \delta\epsilon_{\rm PSF} \rangle = 0$.

3. We neglect correlations between the ellipticity and the
inverse squared radius of the galaxy. This is reasonable
for the PSF calibration in the small ellipticity regime.

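With these simplifications, one can substitute Eq. 8 into
Eq. 10 to obtain:

$$ \sigma^2_{\rm sys} = (P^\gamma)^{-2} \left\langle \left(\frac{R_{\rm PSF}}{R_{\rm gal}}\right)^4 \right\rangle \left[ \left\langle |\delta\epsilon_{\rm PSF}|^2 \right\rangle + \left\langle |\epsilon_{\rm gal} - \epsilon_{\rm PSF}|^2 \right\rangle \frac{\left\langle |\delta(R^2_{\rm PSF})|^2 \right\rangle}{R^4_{\rm PSF}} \right] \tag{11} $$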



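We develop a more compact expression by adopting the
following notation:

$$ C \equiv (P^\gamma)^{-2} \left\langle \left(\frac{R_{\rm PSF}}{R_{\rm gal}}\right)^4 \right\rangle \tag{12} $$
$$ \mathcal{E} \equiv \left\langle |\epsilon_{\rm gal} - \epsilon_{\rm PSF}|^2 \right\rangle \tag{13} $$
$$ \sigma^2_{\rm sys} = C \left[ \left\langle |\delta\epsilon_{\rm PSF}|^2 \right\rangle + \mathcal{E}\, \frac{\left\langle |\delta(R^2_{\rm PSF})|^2 \right\rangle}{R^4_{\rm PSF}} \right] \tag{14} $$

In PI, we considered only the scatters $\sigma[R^2_{\rm PSF}]$ and
$\sigma[\epsilon_{\rm PSF}]$ (i.e. in the zero bias case: $b[\epsilon_i] = b[R^2] = 0$) and we
approximate the statistical averages with spatial averages:
$\sigma^2[R^2_{\rm PSF}] \simeq \langle |\delta(R^2_{\rm PSF})|^2 \rangle$ and $\sigma^2[\epsilon_{{\rm PSF},i}] \simeq \langle |\delta\epsilon_{{\rm PSF},i}|^2 \rangle$. In
this paper, with the introduction of biases, the scatter becomes
the MSE:

$$ {\rm MSE}[R^2_{\rm PSF}] \simeq \left\langle |\delta(R^2_{\rm PSF})|^2 \right\rangle, \qquad {\rm MSE}[\epsilon_{{\rm PSF},i}] \simeq \left\langle |\delta\epsilon_{{\rm PSF},i}|^2 \right\rangle \tag{15} $$

This leads to:

$$ \sigma^2_{\rm sys} = C \left[ b^2[\epsilon_{{\rm PSF},1}] + b^2[\epsilon_{{\rm PSF},2}] + \frac{\mathcal{E}}{R^4_{\rm PSF}}\, b^2[R^2_{\rm PSF}] + \sigma^2[\epsilon_{{\rm PSF},1}] + \sigma^2[\epsilon_{{\rm PSF},2}] + \frac{\mathcal{E}}{R^4_{\rm PSF}}\, \sigma^2[R^2_{\rm PSF}] \right] \tag{16} $$

We can see that $\sigma^2_{\rm sys}$ is proportional to the quadratic sum
of 6 terms: three bias terms and three statistical ones.
Collecting terms of similar type using the following notation:

$$ B \equiv b^2[\epsilon_{{\rm PSF},1}] + b^2[\epsilon_{{\rm PSF},2}] + \frac{\mathcal{E}}{R^4_{\rm PSF}}\, b^2[R^2_{\rm PSF}] \tag{17} $$
$$ S \equiv \sigma^2[\epsilon_{{\rm PSF},1}] + \sigma^2[\epsilon_{{\rm PSF},2}] + \frac{\mathcal{E}}{R^4_{\rm PSF}}\, \sigma^2[R^2_{\rm PSF}] \tag{18} $$

gives:

$$ \sigma^2_{\rm sys} = C\,[B + S] \tag{19} $$

Although only $S$ depends on the SNR of the stars, $B$ and $S$
both depend on the complexity of the modeling. They can
be denoted $B(\Psi)$ and $S(\Psi, \mathcal{S}_{\rm eff})$. In PI, we show that the latter is
given by:

$$ S(\Psi, \mathcal{S}_{\rm eff}) = \frac{\Psi^2}{n_*\, \mathcal{S}^2_{\rm eff}} \tag{20} $$

where the overall complexity of the model $\Psi$ is given by the
complexities $\psi_\epsilon$ and $\psi_{R^2}$ associated with the ellipticity and
the squared radius of the model, respectively (see Eq. 2):

$$ \Psi^2 \equiv 2\psi^2_\epsilon + \mathcal{E}\, \psi^2_{R^2} \tag{21} $$

Equations 19 and 20 then imply that:

$$ \sigma^2_{\rm sys} = C \left[ B(\Psi) + \frac{\Psi^2}{n_*\, \mathcal{S}^2_{\rm eff}} \right] \tag{22} $$

From this equation, we can see that increasing the complexity
$\Psi$ by adding degrees of freedom in the PSF model can reduce
$B$ but also increases the statistical errors. Minimizing
$\sigma^2_{\rm sys}$ thus implies the search for the optimal trade-off in the
value of $\Psi$.

4. Optimal PSF model

In Sect. 4.1, we present a PSF example that we use in
the remainder of this paper to illustrate our discussion.
In Sect. 4.2, we show the optimal complexity of the PSF
model (that minimizes $\sigma^2_{\rm sys}$) and apply this to the PSF example.
We then explore this optimization in more detail in
Sect. 4.3 by examining a particular case in which the bias
can be described by a power-law function of the complexity.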



4.1. PSF example 

To illustrate our discussion, we study a realistic example
of a PSF with complex features in the tails, and investigate
what happens when fitting it with various shapelet
basis sets as a function of the SNR of the available stars. We
also use a shapelet basis set (which differs from that used
in the fits) for describing the underlying PSF. This use of
shapelets for both the PSF model and the underlying PSF
was chosen for three reasons:

— first, it allows pixelation issues to be ignored, which are
beyond the scope of this paper. Indeed, the description
of the underlying PSF is performed by using the continuous
shapelet functions and the fits are performed at
high resolution;

— second, it considerably simplifies both the calculations
and the fitting process, due to the orthogonality of
shapelet functions (the average estimation of a fitted
coefficient is the true value, independently of the other
coefficients);

— third, it is a simple and convenient framework for illustrating
the use of sparsity as a tool in optimizing the
complexity of the PSF modeling.

Our example of an underlying PSF is constructed using
$n_{\rm max} = 34$ (with the diamond option), as shown in Fig. 1.
In Fig. 1 we also show the 16 fits performed with the 4
shapelet basis sets (corresponding to $n_{\rm max}$ = 4, 6, 10, and
20) and $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$ (the stellar signal-to-noise ratio, see Eq.
3) equal to 100, $10^3$, $10^4$, or $\infty$ (the latter is the ideal case
of no background). To determine the overall complexity $\Psi$,
which depends on the rms of galaxy ellipticities in terms of


Fig. 1. PSF example (top panel, labelled 'original') adopted in this paper and best fits of it (other panels) with 4 shapelet basis sets
(corresponding to $\Psi_{\rm fit}$ = 2.6, 4.3, 7.8 and 16.4, i.e. $n_{\rm max}$ = 4, 6, 10 and 20 with the diamond option) and for $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$
equal to 100, $10^3$, $10^4$ and $\infty$ ($n_*$ is the number of stars used for the fit and $\mathcal{S}_{\rm eff}$ is the effective SNR of stars, see Eq. 3;
infinity corresponds to the ideal case of no background). For a given value of $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$, all fits are performed with the same
realization of the noise. Colors show the flux (darker colors indicate brighter regions) and show that this PSF is almost
circular at the center. Contours show some isophotes not visible with the color scale and reveal the complex structure of
the tails. The original (i.e. underlying) PSF in the top panel was built using a model with $\Psi$ = 28.4. The optimal value
$\Psi_{\rm opt}$ of the fitted complexity (in order to minimize $\sigma^2_{\rm sys}$) is indicated in brackets for each value of $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$. This figure
illustrates that, for a given value of $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$, the simpler the model (i.e. the lower $\Psi_{\rm fit}$), the poorer the description of the
tails and the larger the bias. On the other hand, for a given $\Psi_{\rm fit}$, the lower the amount of information available in stars
(i.e. the smaller $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$), the noisier the description of the tails.



the parameter $\mathcal{E}$ (see Eqs. 13 and 21), we adopt the typical
value $\mathcal{E} = 0.2$, for which $n_{\rm max}$ = 4, 6, 10, 20, 34 correspond
to $\Psi$ = 2.6, 4.3, 7.8, 16.4, 28.4 respectively. In the following,
we also adopt the value $C = 0.066$, that corresponds to
the typical values $P^\gamma = 1.84$ and
$\langle (R_{\rm gal}/R_{\rm PSF})^{-4} \rangle^{-1/4} = 1.5$.

Fig. 1 illustrates that:

— when $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$ is sufficiently high, complex basis sets are
required to model the complex tails, i.e. the amount
of bias $B$ decreases as the complexity $\Psi$ of the model
increases. For instance, a fit with $\mathcal{S}_{\rm eff} = \infty$ and $\Psi$ = 28.4
would allow one to recover our PSF example exactly,
with $B = 0$.

— a higher complexity requires a higher number of DoF
to be fitted. Consequently, for a given value of $\sqrt{n_*}\,\mathcal{S}_{\rm eff}$,
increasing the complexity of the model also increases
the scatter in the estimated shape. Therefore, it is not
always appropriate to use a complex fit model; it may
be more robust to use a simplified (but more biased) fit
model.

4.2. Optimizing the complexity of the PSF model

The optimal PSF model is that for which $\sigma^2_{\rm sys}$ is minimized,
varying $\Psi$. We define the optimal value $(\sigma^{\rm opt}_{\rm sys})^2$ to be the minimal
value of $\sigma^2_{\rm sys}$ and, in the same spirit, we note that $\Psi_{\rm opt}$
is the corresponding value of $\Psi$:

$$ (\sigma^{\rm opt}_{\rm sys})^2 = \sigma^2_{\rm sys} \quad \mbox{such that} \quad \frac{\partial \sigma^2_{\rm sys}}{\partial \Psi} = 0 \tag{23} $$

For instance, Fig. 2 illustrates the search for the optimal
shapelet basis when our PSF example (see the previous
section) is estimated with 50 stars (and with $\mathcal{S}_{\rm eff}$ = 1000).
The optimal model is that corresponding to $\Psi_{\rm opt} \simeq 6$ (i.e.
$n_{\rm max}$ = 8 with the diamond option) and $(\sigma^{\rm opt}_{\rm sys})^2 \simeq 10^{-7}$,
shown by the red diamond.

Fig. 2. Total variance $\sigma^2_{\rm sys}$ and its contributions (see Eq.
16) with respect to the fit model complexity $\Psi$, for our PSF
example (see Sect. 4.1 and Fig. 1) fitted with 50 stars (i.e.
$n_*$ = 50), and $\mathcal{S}_{\rm eff}$ = 1000. We note $b'[R^2] = \sqrt{\mathcal{E}}\, b[R^2]/R^2$
and $\sigma'[R^2] = \sqrt{\mathcal{E}}\, \sigma[R^2]/R^2$. The optimal basis set is that
for which $\sigma^2_{\rm sys}$ is minimum. At this point, shown by the red
diamond, $\sigma^2_{\rm sys} = (\sigma^{\rm opt}_{\rm sys})^2 \simeq 10^{-7}$ and $\Psi = \Psi_{\rm opt} \simeq 6$.

For a given fit model, increasing $n_*$ reduces the scatters
but not the biases in the model fitting (see Eq. 22).
Therefore, as $n_*$ increases, $\Psi_{\rm opt}$ increases and $(\sigma^{\rm opt}_{\rm sys})^2$ decreases.
This is illustrated in Fig. 3, which shows $B$, $S$, and
$\sigma^2_{\rm sys}$ (see Eqs. 17 to 19) when our PSF example is estimated
with $n_*$ = 10, 50, or 200 (still $\mathcal{S}_{\rm eff}$ = 1000).

Fig. 4 shows $(\sigma^{\rm opt}_{\rm sys})^2$ as a function of $n_*$. The diamonds
represent the curve for our PSF example illustrated in all
previous plots, while the bold-straight line without any diamond
shows the ideal case (addressed in PI) of a PSF described
perfectly by the model (i.e. $B = 0$). In this ideal case, $(\sigma^{\rm opt}_{\rm sys})^2$
varies with $n_*$ as predicted by our scaling relation presented
in PI. The dotted and dashed lines are discussed in
Sect. 4.3.

4.3. Example of optimal complexity in the case of a
power-law function

In this section, we derive the optimal complexity when the
bias $B$ is a power-law function of the complexity written as
$B \propto 1/\Psi^\alpha$. We investigate $(\sigma^{\rm opt}_{\rm sys})^2$ as a function of $n_*$ and
$\alpha$. We normalize the power-law function such that:

$$ B(\Psi) = B_0 \left(\frac{\Psi_0}{\Psi}\right)^\alpha \tag{24} $$

In our example of a PSF fitted with a shapelet basis set
(see Sect. 4.1 and Fig. 1), the smallest value of $\Psi$ that
we consider is 2.6 (which corresponds to $n_{\rm max}$ = 4 with the
'diamond' configuration, see PI). This explains why, for this
example, we choose to normalize the power-law function to
this value $\Psi_0 = 2.6$, implying $B_0 = 2 \times 10^{-5}$. This model
is illustrated in Fig. 5 for $\alpha$ equal to 2, 4, 6, and 8. This
is superimposed on the $B$ versus $\Psi$ relation that we obtain
when fitting our PSF example with different shapelet basis
sets, as described in Sect. 4.1. We see that in this case $B$ is
reasonably described by $\alpha = 4$.

This representation by a power-law function is particularly
convenient because $\alpha$ can be identified with the sparsity:
a high value of $\alpha$ means that the PSF model is efficient
in representing the underlying PSF with a small number of
free parameters. Conversely, a low value of $\alpha$ means the PSF
model requires a large number of parameters to describe the
underlying PSF without large residuals. In the following, $\alpha$
is called the 'sparsity parameter'. Together with this power-law
representation (Eq. 24), Eqs. 22 and 23 imply that:
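
The minimization is short enough to spell out. The following is a sketch of the steps, assuming the normalization $B(\Psi) = B_0 (\Psi_0/\Psi)^\alpha$ for Eq. 24 and the form $\sigma^2_{\rm sys} = C\,[B(\Psi) + \Psi^2/(n_* \mathcal{S}^2_{\rm eff})]$ for Eq. 22; the grouping of factors may differ from that of the published equations.

```latex
\begin{align*}
\frac{\partial \sigma^2_{\rm sys}}{\partial \Psi} = 0
  \quad &\Rightarrow \quad
  \Psi_{\rm opt}^{\alpha+2} = \frac{\alpha}{2}\, B_0\, \Psi_0^{\alpha}\, n_*\, \mathcal{S}^2_{\rm eff}, \\
B(\Psi_{\rm opt}) &= \frac{2}{\alpha}\, \frac{\Psi_{\rm opt}^2}{n_*\, \mathcal{S}^2_{\rm eff}}, \\
(\sigma^{\rm opt}_{\rm sys})^2
  &= C \left(1 + \frac{2}{\alpha}\right) \frac{\Psi_{\rm opt}^2}{n_*\, \mathcal{S}^2_{\rm eff}}
   = C \left(1 + \frac{2}{\alpha}\right)
     \left(\frac{\alpha}{2}\, B_0\, \Psi_0^{\alpha}\right)^{2/(\alpha+2)}
     \left(n_*\, \mathcal{S}^2_{\rm eff}\right)^{-\alpha/(\alpha+2)} .
\end{align*}
```

At the optimum the bias term is $2/\alpha$ times the statistical term, so a sparser model (larger $\alpha$) leaves a smaller share of the error budget to the biases.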
Fig. 3. Total variance $\sigma^2_{\rm sys}$ (3 thick and solid colored
curves) and its contributions $B$ (one thin and solid black
curve) and $S$ (3 dashed colored curves), as defined by Eqs.
17 to 19, with respect to the PSF model complexity $\Psi$,
for our PSF example (presented in Sect. 4.1 and Fig. 1) fitted
with 10 (red curves), 50 (green curves) or 200 (blue
curves) stars (i.e. $n_*$ = 10, 50, 200) with an effective SNR
of $\mathcal{S}_{\rm eff}$ = 1000. $S$ depends on $n_*$ (see Eq. 20), implying a
different curve for each value of $n_*$, while $B$ does not. This
illustrates that $(\sigma^{\rm opt}_{\rm sys})^2$ (i.e. the minimum value of $\sigma^2_{\rm sys}$, shown
by the diamonds) decreases as $n_*$ increases, while $\Psi_{\rm opt}$ increases.

Note that Eq. 27 expresses $(\sigma^{\rm opt}_{\rm sys})^2$ (the minimum variance
in the systematic errors in shear measurements that can
be achieved, see Eqs. 9, 10 and 23) in terms of a set of
parameters that can be divided into 2 families:

1. parameters that are properties of the data set, such as
$C$, $B_0$, $\Psi_0$, and $\mathcal{S}_{\rm eff}$.

2. parameters that are properties of the analysis method,
such as $n_*$ (i.e. the number of stars used to calibrate
the PSF) and $\alpha$ (i.e. the sparsity parameter of the PSF
model).

When analysing a given data set, the parameters in the first
family are kept fixed. The only parameters that can vary
to achieve optimization during the analysis are those in the
second family (i.e. $n_*$ and $\alpha$). For a given $\alpha$, $(\sigma^{\rm opt}_{\rm sys})^2$ is
proportional to $n_*^{-\alpha/(\alpha+2)}$.
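
The optimum can also be explored numerically. The sketch below assumes $\sigma^2_{\rm sys} = C\,[B_0(\Psi_0/\Psi)^\alpha + \Psi^2/(n_*\mathcal{S}^2_{\rm eff})]$ and uses the closed form obtained by setting its derivative with respect to $\Psi$ to zero. The function names are hypothetical, and the default values of `c`, `b0` and `psi0` are illustrative assumptions for the shapelet example, not authoritative numbers.

```python
def sigma2_sys(psi, n_star, s_eff, alpha, c=0.066, b0=2e-5, psi0=2.6):
    """Variance of the systematics for a model of complexity psi (power-law bias)."""
    bias_term = b0 * (psi0 / psi) ** alpha      # B(psi), independent of the stars
    stat_term = psi ** 2 / (n_star * s_eff ** 2)  # S(psi), shrinks with more stars
    return c * (bias_term + stat_term)


def optimal_complexity(n_star, s_eff, alpha, b0=2e-5, psi0=2.6):
    """Complexity at which the derivative of sigma2_sys with respect to psi vanishes."""
    return (0.5 * alpha * b0 * psi0 ** alpha * n_star * s_eff ** 2) ** (1.0 / (alpha + 2.0))


def sigma2_opt(n_star, s_eff, alpha, c=0.066, b0=2e-5, psi0=2.6):
    """Minimum of sigma2_sys over psi for the given stars and sparsity."""
    psi = optimal_complexity(n_star, s_eff, alpha, b0, psi0)
    return sigma2_sys(psi, n_star, s_eff, alpha, c, b0, psi0)


# Trend with the number of stars, at fixed sparsity alpha = 4 and S_eff = 1000.
for n_star in (10, 50, 200):
    psi = optimal_complexity(n_star, 1000.0, 4.0)
    print(n_star, psi, sigma2_opt(n_star, 1000.0, 4.0))
```

Multiplying the number of stars by 4 at fixed $\alpha$ rescales the minimum variance by $4^{-\alpha/(\alpha+2)}$, which is the scaling stated above.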

5. Required number of stars 

As discussed in the introduction, an important issue for
cosmic shear surveys is to ensure that systematics are kept
smaller than the statistical errors, by demanding an upper
limit to $\sigma^2_{\rm sys}$. Part of the systematics have their origin in
the PSF calibration, which is imperfect due to the limited
number of stars available. In this section, we express $N_*$,



Fig. 4. Optimal variance of the shape measurement systematics
$(\sigma^{\rm opt}_{\rm sys})^2$ as a function of the number of stars $n_*$
used to calibrate the PSF. The diamonds show the curve
for our PSF example illustrated in all previous plots (presented
in Fig. 1) and fitted with shapelets. The straight
bold line shows the ideal case (addressed in PI) of a PSF
model that exactly describes the PSF (i.e. with no residual).
The horizontal line shows the value $(\sigma^{\rm opt}_{\rm sys})^2 = 10^{-7}$, which
is the requirement to be able to constrain $w_0$ and $w_a$ at
0.02 and 0.1 respectively (Amara & Refregier 2007b). The
blue lines (dashed, dotted and dotted-dashed) are discussed
in Sect. 4.3. They are the curves expected when modeling
the bias $B$ with a power-law function of the complexity as
stated by Eq. 24 and illustrated in Fig. 5.



the number of stars required to calibrate the PSF, in terms
of the level of systematic errors $\sigma_{\rm sys}$ (note the capital 'N',
as opposed to $n_*$, which is the number of stars involved in
the PSF calibration process: we need $n_* > N_*$ to ensure
that systematic effects are below $\sigma_{\rm sys}$. $N_*$ is the lower limit
of $n_*$).

In Sect. 5.1, we summarize the conclusions of PI which
apply when the underlying PSF and the PSF model have
the same functional form (i.e. $B = 0$) and we extend these
conclusions to the general case of PSF modeling performed
with any model (i.e. $B$ not necessarily equal to 0). In
Sect. 5.2, we invert Eq. 27 (that holds when $B$ is described
by a power-law function of the complexity: $B \propto 1/\Psi^\alpha$) and
express $N_*$ as a function of $\alpha$ and of the minimum systematic
level $\sigma^{\rm opt}_{\rm sys}$ achievable when the complexity of the PSF
modeling is optimal.

5.1. Generalised scaling relation 

Fig. 5. Overall bias $B$ versus $\Psi$ for our PSF example (in
black) and some power-law functions $B \propto 1/\Psi^\alpha$, $\alpha$ = 2, 4,
6, 8 (in blue) normalized to intercept at $\Psi$ = 2.6. We see
that the case $\alpha = 4$ fits well with the example.

In the optimistic case where the PSF calibration is the only
significant source of systematic errors, a given value of $N_*$
(i.e. a given number of stars involved in the PSF calibration)
implies a value of $\sigma_{\rm sys}$. This is presented in PI in the
form of a scaling relation that links $N_*$, $\sigma_{\rm sys}$, $\mathcal{S}_{\rm eff}$ (the effective
signal-to-noise ratio of stars), $(R_{\rm gal}/R_{\rm PSF})_{\rm min}$ (the
ratio between the smallest galaxy size and the PSF size),
and $\Psi$ (the complexity of the PSF):




The factor 2 at the end comes from the fact that $\Psi^2 \simeq 2\psi^2_\epsilon$
(in PI, this scaling relation is written in terms of $\psi_\epsilon$). This
holds for the assumption that the PSF model is able to
describe the PSF without any bias (i.e. $B = 0$). With non-zero
$B$ and adopting the same simplifications and the same
typical values as in PI, Eq. 22 leads to the more general
relation:



where 
Thus, taking $B$ into account in the scaling relation translates
into the new factor $1/(1 - C B/\sigma^2_{\rm sys})$ in Eq. 30, which
equals 1 when $B$ is zero (then the relation 29 is equivalent
to the scaling relation given in PI) and is related to the
ratio $C B/\sigma^2_{\rm sys}$, which is the relative weight of biases in the error
budget.



Fig. 6. $h(\alpha)$ as defined in Eq. 32.
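
A numerical sketch of the function plotted in Fig. 6 and of the star count it enters is given below. It assumes that $h(\alpha) = (\alpha/2)^{2/\alpha}\,(1 + 2/\alpha)^{1+2/\alpha}$, which is the form obtained by inverting the minimum of $\sigma^2_{\rm sys} = C\,[B_0(\Psi_0/\Psi)^\alpha + \Psi^2/(n_*\mathcal{S}^2_{\rm eff})]$; the published definition may group factors differently. The function names are hypothetical, and the default values of `c`, `b0` and `psi0` are illustrative assumptions.

```python
def h(alpha):
    """Dimensionless function of the sparsity parameter (assumed form, see lead-in)."""
    return (alpha / 2.0) ** (2.0 / alpha) * (1.0 + 2.0 / alpha) ** (1.0 + 2.0 / alpha)


def required_stars(sigma2_target, alpha, s_eff, c=0.066, b0=2e-5, psi0=2.6):
    """Number of stars needed so that the optimal variance equals sigma2_target.

    Returns a float; round up to obtain an integer number of stars.
    """
    prefactor = psi0 ** 2 * b0 ** (2.0 / alpha) / s_eff ** 2
    return prefactor * (c / sigma2_target) ** (1.0 + 2.0 / alpha) * h(alpha)


# Example: stars needed to reach a target variance of 1e-7 for several sparsities,
# with an effective stellar SNR of 1000 (illustrative values).
for alpha in (2.0, 4.0, 6.0, 8.0):
    print(alpha, h(alpha), required_stars(1e-7, alpha, 1000.0))
```

Because the target variance enters with the exponent $1 + 2/\alpha$, a low sparsity makes the required number of stars grow much faster as the requirement on $\sigma^2_{\rm sys}$ tightens.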



5.2. Application to the power-law model

Eq. 27 can be inverted to provide $N_*$ (the number of stars
required to calibrate the PSF) as a function of
$(\sigma_{sys}^2)_{min}$ (the minimum level of systematics
achievable when optimizing the complexity of the PSF modeling),
$\alpha$ (the sparsity parameter), $S_{eff}$ (the effective SNR
of stars defined in Eq. 3), and $C$ (a dimensionless factor
defined in Eq. 12):



$$N_* = h(\alpha)\,\frac{B_0^{2/\alpha}}{S_{eff}^{2}}\left(\frac{C}{(\sigma_{sys}^2)_{min}}\right)^{1+2/\alpha} \qquad (31)$$



with the dimensionless function: 
shown in Fig. 6. With the notation and scaling of Sect. 5.1,
Eq. 31 is equivalent to:
This equation allows one to estimate the number of stars
required to calibrate the PSF and thus, given the stellar
density, the minimum scale on which the PSF calibration is
possible. On smaller scales, stars provide insufficient
information for calibrating PSF variations. This implies that
these smaller scales may be contaminated by systematics due to
a poor correction of the PSF and should not be used to estimate
cosmological parameters, unless the variabilities on small
scales are known to be extremely small. As shown by Amara &
Refregier (2007b) and discussed in PI, future all-sky cosmic
shear surveys will need to achieve
$(\sigma_{sys}^2)_{min} \sim 10^{-7}$ to be able to estimate
$w_0$ and $w_a$ with uncertainties of about 0.02 and 0.1,
respectively. Fig. 7 shows that, for our PSF example, it is
possible to achieve this accuracy when calibrating the PSF with
50 stars, if $\alpha > 4$. Although this is not a general
statement (it assumes that $B$ can be described by a power-law
function, see Eq. 24, and depends on the normalization
parameter $B_0$), this is a representative example of the
sparsity requirement
for future cosmic shear surveys. On the other hand, for current
cosmic shear surveys of areas ~ 50 deg$^2$, we have the
requirement $(\sigma_{sys}^2)_{min}$ — ^ ^ 10~^ (see Amara &
Refregier (2007b) and PI). In this case, it is possible to
calibrate the PSF with a few stars when $\alpha > 2$. This
sparsity requirement is reached with the current PSF correction
methods (for instance that based on shapelets as in Berge et al.
2008) and, assuming a star density of about 1 per arcmin$^2$,
this is consistent with the presence of significant B modes
usually found on scales smaller than a few arcmins.
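As a rough illustration of the link between the required number
of stars and the minimum calibration scale, the following Python
sketch converts a star count and a stellar density into the side
of the square patch that contains that many stars on average.
The function name and the square-patch geometry are assumptions
made for this example only; they are not part of the analysis
above.

```python
import math


def min_calibration_scale_arcmin(n_stars, star_density_per_arcmin2):
    """Side (in arcmin) of a square patch holding n_stars on average.

    Assumption for illustration: stars are spread uniformly and the
    calibration patch is a square.
    """
    if n_stars <= 0 or star_density_per_arcmin2 <= 0:
        raise ValueError("star count and density must be positive")
    area_arcmin2 = n_stars / star_density_per_arcmin2
    return math.sqrt(area_arcmin2)
```

With 50 stars and about 1 star per arcmin$^2$, this gives a scale
of roughly 7 arcmin, while a handful of stars corresponds to a
few arcmin.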



6. Conclusions 

We explore the systematics induced in cosmic shear by the 
PSF calibration/correction process and study how to opti- 
mize the PSF model to minimize the systematic errors in 
cosmological parameter estimations. In this framework, we 
revisit the concept of the complexity of the PSF, defined 
in our previous paper (PI), and introduce the concept of 
the 'sparsity' of the PSF model. The complexity $\Psi$
characterizes the number of degrees of freedom in the model. A
small number of degrees of freedom corresponds to a low $\Psi$
and relates to a simple PSF model, which can be fitted to
stellar observations of low signal-to-noise ratio, but is
likely to be highly biased. On the other hand, a large number
of degrees of freedom corresponds to a high $\Psi$ and relates
to a complex PSF model, which is expected to have a low bias
but requires stellar observations of high signal-to-noise ratio
to avoid large statistical scatter in the fitted parameters. In
PI, we related the complexity $\Psi$ of the PSF model to the
systematic errors in cosmological parameter estimations. In
this paper, we show how the complexity can be optimized
depending on the stars available, using the concept of
sparsity. The sparsity characterizes how quickly the residuals
between the best-fit PSF model and the underlying PSF decrease
when degrees of freedom are added to the model.
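The trade-off between scatter and bias can be sketched with a
toy error budget. The Python example below is only an
illustration under stated assumptions: the scatter term is taken
to grow as the square of the complexity divided by the number of
stars and the squared signal-to-noise ratio, and the bias term
to fall as a power law with sparsity parameter alpha. These
functional forms, the function names and the numerical values
are hypothetical and should not be read as the relations derived
in this paper.

```python
import math


def toy_error_budget(psi, n_stars, s_eff, b0, alpha):
    """Toy systematic level: scatter term plus bias term (assumed forms)."""
    scatter = psi ** 2 / (s_eff ** 2 * n_stars)  # grows with complexity
    bias = b0 * psi ** (-alpha)                  # power-law decrease
    return scatter + bias


def optimal_complexity(n_stars, s_eff, b0, alpha):
    """Complexity that minimizes the toy budget.

    Obtained by setting the derivative of toy_error_budget with
    respect to psi to zero.
    """
    return (alpha * b0 * s_eff ** 2 * n_stars / 2.0) ** (1.0 / (alpha + 2.0))


# Arbitrary illustrative values, not taken from the paper.
psi_opt = optimal_complexity(n_stars=50, s_eff=500.0, b0=1e-3, alpha=4.0)
```

In this toy model, adding stars or raising their signal-to-noise
ratio shifts the optimum toward higher complexity, which matches
the qualitative behaviour described above.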

In the general case, we also extend the scaling relation, 
proposed in PI, between the number of stars used to cali- 
brate the PSF and the systematic errors in the cosmological 
parameter estimations. As discussed in PI, this relation,
combined with the constraint of maintaining the systematics
below the statistical uncertainties when estimating
cosmological parameters, yields the number of stars $N_*$
required for the PSF calibration. $N_*$ corresponds to the
minimum scale on which the PSF modeling is accurate: on scales
smaller than this minimum, there is insufficient information in
the data to calibrate the PSF variations. This implies that these
smaller scales may be contaminated by systematics related 
to a poor PSF correction and should not be used when es- 
timating cosmological parameters (unless the variabilities 
are known to be small, due, for instance, to the quality of 
the hardware). 

We consider a realistic PSF example and model the amount of
bias $B$ between the PSF fit and the underlying PSF by a
power-law function of the fitted complexity:
$B \propto 1/\Psi^{\alpha}$, where $\alpha$ is the sparsity
parameter. We find that, for this PSF, current cosmic shear
analyses that cover 50 deg$^2$ or less need $\alpha$ to be
higher than 2, which is achievable by current analysis methods.
Thus, current cosmic shear analyses do not require a rigorous
optimization of the PSF model. On the other hand, future cosmic
shear surveys that aim to measure $w_0$ and $w_a$ to an accuracy
of 0.02 and 0.1, respectively, will require $\alpha > 4$ to
calibrate the PSF with 50 stars. This relation between the
required number of stars $N_*$ and the accuracy of the
calibration depends on the underlying PSF. This explains why
these values, although corresponding to realistic orders of
magnitude, cannot be assumed to represent a general result. Two
parameters drive this relation: the amount of bias $B_0$ when
fitting the underlying PSF with a PSF model of low complexity
($N_*$ being proportional to $B_0^{2/\alpha}$; in our example,
$B_0$ = 2 X 10~^), and the sparsity parameter $\alpha$ of the
PSF modeling during the analysis. It is thus possible to
optimize cosmic shear surveys at two levels: when optimizing the
observational conditions, the PSF must be as simple and stable
as possible so that it can be described by a low-complexity
model (this minimizes $B_0$); when analysing the data, the PSF
modeling must be optimized to have as high a value of the
sparsity $\alpha$ as possible.
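To make the role of the two parameters concrete, the short
Python sketch below evaluates how the required number of stars
changes when $B_0$ is rescaled, assuming the proportionality
between $N_*$ and $B_0^{2/\alpha}$ quoted above and keeping every
other quantity in the relation fixed. The function name is
hypothetical and the sketch covers only this one scaling, not
the full relation.

```python
def star_count_ratio(b0_ratio, alpha):
    """Factor by which N_* changes when B_0 is multiplied by b0_ratio.

    Assumes N_* is proportional to B_0 ** (2 / alpha), with all other
    quantities (signal-to-noise ratio, systematic level, alpha) fixed.
    """
    if b0_ratio <= 0 or alpha <= 0:
        raise ValueError("b0_ratio and alpha must be positive")
    return b0_ratio ** (2.0 / alpha)
```

Under this assumption, halving $B_0$ halves the required number
of stars for $\alpha = 2$, but reduces it by only about 30% for
$\alpha = 4$: the higher the sparsity, the weaker the dependence
on the low-complexity bias.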

The approach suggested in this paper is a first step to- 
ward introducing the concept of sparsity to weak lensing 
shape measurements. We do not address issues related to 
the pixelation. Moreover, although we only address the PSF 
calibration, this approach is also applicable to other topics
such as the description of galaxy shapes.